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Abstract 

We develop a methodology for the construction of a Hessian representation of Monte Carlo 
sets of parton distributions, based on the use of a subset of the Monte Carlo PDF replicas 
as an unbiased linear basis, and of a genetic algorithm for the determination of the optimal 
basis. We validate the methodology by first showing that it faithfully reproduces a native 
Monte Carlo PDF set (NNPDF3.0), and then, that if applied to Hessian PDF set (MMHT14) 
which was transformed into a Monte Carlo set, it gives back the starting PDFs with minimal 
information loss. We then show that, when applied to a large Monte Carlo PDF set obtained 
as combination of several underlying sets, the methodology leads to a Hessian representation 
in terms of a rather smaller set of parameters (MC-H PDFs), thereby providing an alternative 
implementation of the recently suggested Meta-PDF idea and a Hessian version of the recently 
suggested PDF compression algorithm (CMC-PDFs). The mc2hessian conversion code is made 
publicly available together with (through LHAPDF6) a Hessian representations of the NNPDF3.0 
set, and the MC-H PDF set. 
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1 Introduction 

The reliable treatment of uncertainties on the Parton Distributions (PDFs) of the proton is 
currently an essential ingredient for LHC phenomenology (see for example Refs. 0i for recent 
reviews). PDF uncertainties are of a peculiar nature, because they are uncertainties on a space 
of functions, and two main methods have been used to provide a representation of them; the 
Hessian method and the Monte Carlo (MC) method. 

In the Hessian method (currently used for instance in the MMHT14 and CTIO PDF 
sets), a parametrization based on a fixed functional form is introduced, and a multigaussian 
probability distribution is assumed in the space of parameters. Uncertainties are then given as 
the inverse of the covariance matrix of this multigaussian distribution. This is usually obtained, 
assuming linear error propagation and the least-squares method, as the Hessian matrix with 
respect to the parameters of a figure of merit (y^) at its minimum, which is viewed as the 
best-fit PDF. In the Monte Carlo method (currently used for instance in the NNPDF3.0 
PDF set) PDFs are delivered as an ensemble of replicas which provide a discrete (Monte Carlo) 
representation of the underlying probability distribution: uncertainties are then simply obtained 
as moments of this probability distribution. 

The Monte Carlo method has the twofold advantage that no Gaussian and linear error 
propagation assumption is necessary, and also, that PDFs can then parametrized with a general- 
purpose functional form with a large number of parameters (such as neural networks), for which 
the least-squares method would fail. The Hessian method, on the other hand, has the advantage 
that the (orthogonal) eigenvectors of the Hessian matrix may be treated as nuisance parameters. 
This is often a desirable feature when PDFs are used in experimental analysis, because other 
sources of uncertainty are also represented as nuisance parameters, and also because through 
standard methods it is then possible to the determine a subset of nuisance parameters which 
is most important for a given cross-section or distribution. 

Whereas deviations from Gaussianity may be important in specific kinematic regions, espe¬ 
cially when limited experimental measurements are available and PDF uncertainties are driven 
by theoretical constraints (such as for example the large-x region, relevant for new physics 
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searches), in most cases, and specifically when PDF uncertainties are small and driven by abun¬ 
dant experimental data, the Gaussian approximation is reasonably accurate. This then raises 
the question of whether in such case, in which everything is Gaussian and the Hessian approx¬ 
imation is adequate, one could have the best of possible worlds: a Hessian representation with 
the associate advantages, but without having to give up the use of a general-purpose flexible 
functional form. 

It is the purpose of the present paper to achieve this goal. We will do this by using the 
MC replicas themselves as the basis of the linear representation of the original MG sample. 
Indeed, we will show that if replicas of a very large Monte Carlo set (Wep=1000 replicas) are 
represented as a linear combination of a subset of them, not only it is possible to achieve very 
good accuracy by using a much smaller subset of replicas as basis functions, but in fact there is 
an optimal number of basis replicas, in that the degeneracy of replicas is such that larger bases 
would no longer be linearly independent. It turns out that this optimal number is quite small, 
of the same order of magnitude as the typical number of Hessian eigenvectors for standard PDF 
sets such as MMHT14 or CTIO. All this is true if the basis replicas are suitably chosen, which 
we do using a genetic algorithm. We can then simply construct a Hessian representation in 
the space of these linear expansion coefficients, with essentially no information loss or further 
bias introduced in comparison to the starting Monte Carlo representation. It is thus possible to 
provide a faithful, unbiased Hessian representation of any Monte Carlo PDF set, such as those 
provided by NNPDF. 

It is interesting to observe that the inverse problem, namely the conversion of a Hessian PDF 
set into a Monte Carlo representation, has already been considered and solved [lOj . An important 
advantage of being able to provide a MC representation of Hessian sets is the construction of 
combined PDF sets, which incorporate the information contained in several individual sets, as 
required for instance for Higgs boson coupling extraction or New Physics searches at the LHC. 
Currently, the recommended procedure (the so-called PDF4LHC recommendation m HI) is 
to take an envelope, which has no clear statistical meaning. However, once converted into a 
Monte Carlo representation, PDFs based on a common dataset can be combined in a simple 
way m 10|[13| . In this context, also the problem discussed in this work, namely the conversion 
from Monte Carlo to Hessian, has also been handled in the so-called “Meta-PDF” approach [13] . 
In this approach, a functional form similar to those used in the MMHT14 and CT analyses, the 
“meta-parametrization”, is fitted to the combined Monte Carlo PDF set. This clearly achieves 
the same goal as the conversion considered here, but with the further usual bias that a choice 
of functional form entails. If appplied to a combined PDF set, the methodology presented here 
provides thus an unbiased alternative to the Meta-PDF method of Ref. 
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The paper is organized as follows. In Sect. we describe in the detail our methodology for the 
Monte Carlo to Hessian conversion. Then, in Sect.j^we first, apply our methodology to a native 
Monte Carlo set, NNPDF3.0, benchmark its accuracy, and show that we end up with an optimal 
number of Hessian eigenvectors of order of a hundred. We then apply the methodology to a 
Monte Carlo set obtained by applying the Watt-Thorne method to a starting Hessian PDF 
set, MMHT14. This provides a closure test of the methodology: we can check explicitly that 
the starting set is reproduced very well. Finally, in Sect. |^we provide a Hessian representation 
of a Monte Carlo set obtained by combining several underlying PDF sets (either native Monte 
Carlo or converted to Monte Carlo from Hessian). We end up with a set of eigenvectors, the 
MC-H PDFs, which is of similar size of the compressed Monte Carlo PDF obtained recently [14| 
by applying compression algorithms to the large combined replica set, the so-called CMC-PDFs. 
Therefore, the PDF set which we obtain in this case provides an alternative to either the Meta- 
PDFs of Ref. 13 , of which it provides an unbiased version, or to the CMC-PDFs of Ref. |14j, 
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of which it provides a Hessian version. Details of PDF delivery in LHAPDF6 are presented in 
Sect. where conclusions are also drawn. In Appendix we discuss an alternative strategy to 
construct a Hessian representation of MC sets, which is used to validate our main methodology, 
and might turn out to be advantageous for future applications. 


2 Methodology 

As discussed in the introduction, the basic idea of our approach is to construct a linear rep¬ 
resentation for a set of Monte Carlo PDF replicas by expressing them as a linear combination 
of a small subset of them. Linearized error propagation, which is at the basis of the Hessian 
approach, can then be applied to the expansion coefficients, which immediately provide a rep¬ 
resentation of the Hessian matrix. It is important to observe that by “PDF replica” we mean 
the full set of PDFs at the parametrization scale, i.e., seven PDFs provided as a function of x 
for some fixed value, denoted in the following by Qq. These are all represented as a linear 
combination of the basis replicas with fixed coefficients, which thus do not depend on either the 
PDF or X value. Note that, because of the linearity of perturbative evolution, once a replica is 
expressed as a linear combination at the reference scale, all PDFs at all scales (including heavy 
flavors generated dynamically above the corresponding thresholds) are then given by the same 
linear combination of basis replicas. This in particular ensures that sum rules are automatically 
satished. 

We will first, describe how the Hessian matrix is constructed, and then, the optimization of 
parameters that characterize the procedure, specihcally the choice of basis replicas. 


2.1 Construction of the Hessian matrix 

We start assuming that we are given a prior set of PDFs represented as MC replicas {fa'^}k=i,...,Nrep 
where a = 1,..., A^pdf denotes the type of PDF, i.e. A^pdf = 2AIj -|- 1: Nf quarks and antiquarks 
and the gluon. In order to simplify the notation, we drop the explicit dependence of the PDFs 
on X and Q^. The central idea of our strategy consists of finding a subset of replicas, denoted 
by {^o^}i=i,...,Afeig such that any replica of the prior set, can be represented as a 

linear combination 

N ■ 

4"^ ^ fnl = + E ’ k = l,..., Wep , (1) 

^=1 


where fa^ is the central (average) value of the prior MC set; are constant coefficients, 

9 (k) 

independent of a, x and Q ; and denotes the new Hessian representation of the original 
(k) 

replica fa ■ Note that by construction the central value of the Hessian representation is the 
same as that of the original MC set. 

In order to determine the parameters we first define the covariance matrix in the space 

of PDFs for the prior set of replicas as 
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where the averages are performed over the original set of iVrep replicas. Then, we construct a 
2(k) 

figure of merit, Xpdf ■ 




Nf 


- E E 

01 , 13=1 




covP<^^) ^ • 

/ ij^OL^ 




(fc)/ 


( 3 ) 

Note in Eqs. (§ and Q the use of the the subscript “pdf”, to avoid any confusion with the 
covariance matrix and the in the space of experimental data, which do not play any role here. 

The optimal set of expansion coefficients for each of the original Nrep replicas is 

determined by minimization of Eq. Q. This is a convex problem which can be solved in an 
efficient way through Singular Value Decomposition (SVD) techniques. The problem consists of 
finding the vector a of dimension Veig that minimizes the residual of a linear system of dimensions 
{NxNf) X Veig for each replica of the original set. The PDE covariance matrix Eq. Q can be 
viewed as a {NxNf) x (NxNf) matrix cov^^^^, with indices l,m related to those of the original 
definition by I = Nx{a — 1) + i and m = Nx{/3 — 1) + j . Then we define an NxNp,^ x Veig 
matrix Y^q as 


^9 ~ > (^) 

with the same definition for the index m. We can now lay out the linear system by defining 


^Iq = 


E 

m=l 


cov 


pdf'! 


/ Im 


Yr 


mq ? 




E 

m=l 


COV 


/ tm 


( 5 ) 


again, with I = Nx{a — 1) + i. Here (cov^*^^) 2 stands for a square root of inverse covariance 
matrix, i.e., for a semi-positive definite real matrix A, the matrix such that (T 2 YA 2 = A. Einally 
we can recast the original problem Eq. (js ) as that of finding a that minimizes 11 Aa — 11. 

If the starting number of MC replicas is large enough, they will not all be linearly inde¬ 
pendent. In such case, if the number of eigenvectors Vgig is too large, the system will be 
over-determined and the solution will be degenerate in the space of linear expansion coefficients 
a. In these conditions, the correlations between this parameters will be ill-defined, and will 
result in a numerically unstable covariance matrix Eq. (§. On the other hand, if the number 
of eigenvectors Veig is too small, it will not be possible to achieve a small value of the figure of 
merit Eq. Q and the Hessian representation of the original covariance matrix will be a poor 
approximation. Therefore, on quite general grounds one expects that, if one starts with an 
extremely large (“infinite”) number of MC replicas, there will always be an optimal value of 

Neig. 

In Eq. ^ we have introduced a sampling in x, with a total of Nx points. This immediately 
raises the issue of choosing both a suitable spacing and range of the grid of points in x. Because 
PDEs are generally quite smooth, neighboring points in x are highly correlated, and thus the 
x-grid cannot be too fine-grained, otherwise the matrix cov^'^^ rapidly becomes ill-conditioned. 
Purthermore, the choice of the x-grid range must keep into account not only the fact that we want 
the replicas to be especially well-reproduced where they are accurately known (hence the grid 
should not be dominated by points in extrapolation regions), but also that the whole procedure 
is meaningful only if the starting probability distribution is at least approximately Gaussian. 


The way both issues are handled will be discussed in detail in Sect. |2.2| below. 

Having determined the expansion coefficients we obtain the eigenvector directions 
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which describe our original replica set by computing their covariance matrix: 


(a) _ 


N, 


COV. . = 


rep 


iVrep - 1 


(k) (k)\ 

< / 


rep 



i, j — 1, . . . , A^eig • 


( 6 ) 


This covariance matrix in the space of the linear expansion parameters Eq. should not be 
confused with the covariance matrix in the space of PDFs, defined in Eq. Q (hence the different 
superscripts). The Hessian matrix is then the inverse of cov^j \ which we can diagonalize through 
a rotation matrix Vij, thus obtaining a set of eigenvalues A* (as in the Meta-PDF method of 

Ref. [^). 

We thus obtain the one-sigma uncertainty band associated to each orthogonal direction by 
normalizing by \/Ai. Therefore the total one-sigma uncertainty will be given by 




Weig 

,E 

\ 




H 2 


^ {r]a\x, Q^) - fa\x, Q'^ 
i=l 


(7) 


and our final Hessian representation of the original Monte Carlo PDF set is composed by A^eig 
symmetric eigenvectors, given by 



For PDF sets obtained with this Hessian representation, one should use the symmetric Hessian 
formula, namely, the one-sigma PDF uncertainty will be given by 



(9) 


which is the practical recipe for Eq. Q. An analogous expression should be used for the 
computation of PDF uncertainties in physical cross-sections. 

If the method is successful, Eq. Q should be close to the original result for the one-sigma 
PDF uncertainty in the MC representation, namely 




( 10 ) 


Of course, once the Hessian representation is available, all the Hessian technology can be used, 
like the dataset diagonalization method or the computation of the correlation coefficients 
between different cross-sections 15 . Likewise, one can now easily include the orthogonal eigen¬ 


vectors in a nuisance parameter analysis. 


2.2 Optimization 

We discuss now the determination of the optimal set of parameters which characterize our 
procedure, and specifically: 

• the optimal grid of points in x over which the figure of merit Eq. © is evaluated; 

• the optimal number of eigenvectors A^eig and the optimal choice of the basis basis replicas. 
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Figure 1: Comparison of one-sigma and 68% confidence level intervals for some PDFs from the 

NNPDF3.0 NLO set, determined using a sample of N^ep = 1000 MC replicas, at Q = 4 GeV^. From top 
to bottom and from left to right the gluon, down, up and strange PDFs are shown. 


We consider in particular the application of our method to a prior set of Wep = 1000 MC 
replicas from the NNPDF3.0 NLO set. We then consider also Wep = 1000 MC replicas of the 
MMHT14 NLO set, constructed from the original Hessian representation using the Hessian to 
Monte Carlo conversion methodology of Ref. 
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A suitable choice for the grid of points in x is one that ensures that PDFs are well-reproduced 
in the kinematic region which is relevant for phenomenology, and that the spacing of the grid is 
snch that correlations between neighboring points are not so strong that it becomes impossible 
to invert the covariance matrix. In practice, we proceed as follows: we first consider an initial 
grid of points in x G [10“^, 0.9] with Nx = 50 for all PDFs (A^pdf = 7, since heavy flavor PDFs 
are generated dynamically), half equally spaced on a logarithmic scale for x G [10“^, 10“^] and 
half equally spaced on a linear scale for x G [0.1, 0.9]. We then determine the eigenvectors 
of the ensuing 350 x 350 {NxNp^i x Aa:A^pdf) covariance matrix, and discard all eigenvectors 
corresponding to eigenvalues whose size is smaller than a factor 10“^^ times the largest one. 
This removes points which carry little information due to large correlations. We then invert the 
covariance matrix in the remaining subspace. 

A further difficulty arises whenever the prior uncertainties are not Gaussian. In such case, 
a faithful Hessian representation is (by construction) impossible, and our procedure, which 
always leads to a final Hessian matrix, becomes meaningless. Whenever the starting PDF 
set has potentially non-Gaussian uncertainties, it is thus necessary to quantify the deviation 
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from gaussianity in order to make sure that the procedure can be consistently applied. We 
do this using the simplest indicator, namely the second moment of the probability distribution, 
specihcally comparing the one-sigma and 68% confidence level intervals , which for a Gaussian 
distribution coincide. The comparison is shown in Fig. [^for some PDFs in the NNPDF3.0 NLO 
set at = 4 GeV^. It is clear that deviations from gaussianity can be significant whenever 
experimental information is scarce or missing, specihcally at small- and large-x, since in these 
regions the PDF uncertainty is not determined by gaussianly distributed data but rather by 
extrapolation and by theoretical constraints (such as sum rules and cross-section positivity). 

We thus dehne the indicator 

where cra{xi, Qq) and Qq) are respectively the one-sigma and 68% conhdence level inter¬ 

vals for the a-th PDF at point Xi and scale Qq, computed from the original MG representation 
with Wep = 1000 MG replicas. When the prior set has potentially non-Gaussian uncertainties, 
hrst of all we evaluate the hgure of merit Eq. 01 on the same grid of point on which the covari¬ 
ance matrix is computed. We then discard all points for which the deviation Eq. © exceeds 
some threshold value e, i.e. we only include points such that 

ea{xi,Ql)<e. ( 12 ) 


We then proceed as above: on the remaining points we compute the covariance matrix, determine 
its eigenvectors, and discard eigenvectors whose size is less than twelve orders of magnitude 
smaller than the largest eigenvector. Needless to day, this additional initial step is not required 
for sets (such as MMHT14) which are obtained from a Monte Garlo conversion of an original 
Hessian set, and thus have Gaussian uncertainties by construction. 

We now turn to the determination of the optimal basis of replicas for the Hessian repre¬ 
sentation. This optimization requires the definition of a statistical estimator which measures 
the quality of the Hessian representation. Given that the Hessian representation corresponds 
to a Gaussian distribution, and that central values are reproduced by construction, the prob¬ 
ability distribution is fully determined by the covariance matrix. In practice, however, a good 
assessment of the quality of the Hessian representation is obtained by simply verifying that the 
diagonal elements of the covariance matrix are well well reproduced, thanks to the fact that 
correct correlations are automatically provided by the use of PDE replicas as a basis, as we shall 
explicitly verify below. 

We introduce therefore the estimator 


Afpdf 

ERF, = Y,Y. 


2=1 a=l 




cr^ 


{Xi,Qc 


(13) 


which compares the one-sigma standard deviations computed for the original MC and their 
Hessian representations, as given respectively by Eqs. (10) and ([^. We then compute the 
estimator for a given hxed value of A^eig basis replicas. The choice of these is random at first, 
and then optimized using a Genetic Algorithm (GA). 

where the related 


The parameters of the GA are chosen based on the studies of Ref. 14 


problem of optimizing the choice of PDE replica set was studied: we find that a single mutation 
per iteration of the GA is sufficient, with the number of mutants chosen to be between one and 
four per mutation, with probabilities listed in Table It turns out that = 2000 iterations 








mc2hessian vl.0.0 

Tvrmut 
^ ’rep 

-fmut 

1 

30% 

2 

30% 

3 

10% 

4 

30% 


Table 1: Number of mutants per replica and respective probabilities for each generation of the GA. 


of the GA are sufficient to obtain good stability and a sizable improvement of the figure of merit 
in comparison to the starting random selection. 

In Fig. we plot the value of the estimator Eq. (13) after GA minimization, as a function 
of the threshold value of e and the number iVeig of basis replicas. A valley of minima is clearly 
seen, and shown in the plot as a curve, determined by searching for the absolute minimum of 
the estimator as a function of Neig for each fixed e. 

An interesting feature of Fig.[^is that for all values of e the optimal value of A^eig is reasonably 
small, and much smaller than the total number of replicas N^ep = 1000. As one might expect, 
when the value of the threshold e is small, and thus a large number of points in x is excluded, 
the optimal number of eigenvectors is also small, rapidly decreasing when e < 0.2. Clearly, 
however, if e is too small, only few points will be retained in the computation of the figure of 
merit Eq. and the original MG replicas will not be reproduced accurately enough. 

In order to determine the optimal value of e, for each value of N^ig and e shown in Fig. 
we have recomputed the figure of merit Eq. (13) using the same eigenvector basis, but now 
including all points; the valley of minima is then determined again for this new surface. The 
dependence on e is now due to the fact that the eigenvector basis changes as e is varied, though 
the definition of the figure of merit does not. Therefore, the difference between the two curves is 
a measure of how much the exclusion of nongaussian points by the e criterion affects the choice 
of optimal eigenvector set. The two curves are compared in Fig. they are seen to diverge 
when e < 0.2. We take this as an indication that, below this value, the amount of information 


which is necessary in order to describe all points starts being significantly different from that 
which is sufficient for an accurate description of the points which have passed the cut, and thus 
the cut becomes too restrictive. We consequently adopt e = 0.25 (red curve), as a reasonable 
compromise between only including points for which uncertainties are Gaussian, and not loosing 
too much information. 

The profile of the figure of merit with the choice of threshold value e = 0.25, is shown in 
Fig. 0 It is seen that the optimal number of eigenvectors is A^eig = 120. The fact that this 
value is much less than the starting Wep = 1000 means that replicas in the original set are 
strongly correlated. This is nicely consistent with the result that it is possible to construct 
a “compressed” representation of NNPDF3.0, in which the original probability distribution is 
reproduced but including a much smaller, optimized set of Monte Garlo replicas [14] . In fact, it 
turns out that the optimal number of eigenvectors, and the number of compressed replicas, are 
of the same order of magnitude. 

In Fig. we also show the figure of merit when the basis replicas are chosen randomly, instead 
of being optimized through the GA. It is apparent that use of the GA leads to an improvement 
of the figure of merit by almost a factor two. It is interesting to observe that this improvement 
is in fact achieved by modifying only a small fraction of the initial random selection of replicas: 
specifically, only 26 of the replicas used as initial input for the basis are mutated at the end of 
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Figure 2: The figure of merit, Eq. (131, for the Hessian representation of the NNPDF3.0 NLO set, 
computed for points which satisfy the gaussianity criterion Eq. (12), plotted versus the threshold e 
Eq. (12) and the number of eigenvectors iVeig, after the choice of basis replicas has been optimized 
through a run of the GA with the settings of Table. The value of the estimator along the valley of 
minima, i.e. the curve determined by finding the value of fVeig at which the estimator has its absolute 
minimum for each e is shown (blue curve). The red curve marks the value e = 0.25 which is finally 
adopted. The projection of the valley of minima in the (e, 


, iVeig) plane is shown in Fig. 


the iterations of the GA. This suggests that there is still room for improvement in the 

selection of the optimal basis, since the GA only explores combinations that are not to far from 
the initial basis. 

Finally, we have repeated our construction for the Monte Garlo representation of the MMHT14 
NLO PDF set. In this case, because the starting PDF set is Hessian, no gaussianity requirements 
are necessary. On the other hand, because the underlying PDF set is described by a smaller 
number of parameters than either the NNPDF or the combined set considered previously, Monte 
Garlo replicas tend to be more correlated. As a consequence, it is necessary to relax somewhat 
the criterion for selection of the eigenvectors of the covariance matrix: in this case, we keep 
all eigenvectors corresponding to eigenvalues whose size is larger than 10“^® times that of the 
largest one (instead of 10“^^ as in the previous cases). We then simply determine the figure of 
merit as a function of Ngig: results are shown in Fig. Also in this case, the GA leads to an 
improvement of the figure of merit by nore than a factor two. Now, however, the optimal number 
of eigenvectors is rather smaller, Wig = 14, as compared to the NNPDF3.0 result. Again, this 
is of the same order as that which is used when applying the compression algorithm of Ref. 
to the MMHT14 PDFs. 
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Figure 3: The valley of minima shown in Fig. (blue curve) shown as a projection on the (e, iVeig) 
plane, compared to the curve recomputed using the same eigenvector basis but including all points in the 
determination of the figure of merit (black curve). The red line indicates the threshold value e = 0.25 
which is finally adopted. 


3 Results and validation 

We now study in detail the results which are obtained when applying our Monte Carlo to Hessian 
conversion to NNPDF3.0 NLO — a native Monte Carlo PDF set — and MMHT14 — a Hessian 
PDF set which has been turned into Monte Carlo using the technique of Ref. [10] , In the former 
case, we compare the final Hessian PDFs to the starting Monte Carlo ones, at the level of 
central values, uncertainties, and correlations, thereby validating the procedure. In the latter 
case, we compare the results of the final Hessian conversion to the original Hessian set, thereby 
providing a powerful closure test of the procedure. Finally, we compare results before and after 
Hessian conversion for both PDF sets at the level of physical observables, in order to ascertain 
the accuracy of our methodology for realistic applications. PDF comparisons and luminosity 
plots shown in this section have been produced using the APFEL Web plotting tool |17[|18|. 


3.1 Hessian representation of Monte Carlo PDFs 


We concentrate on the PDF set obtained starting with Wep = 1000 NNPDF3.0 NLO replicas 


and applying our methodology with the optimal choice of parameters discussed in Section 2.2 
namely e = 0.25, Wig = 120. In Fig. |^we compare the original Monte Carlo representation to 
the final Hessian representation of several PDFs at = 2 GeV^; the excellent accuracy of the 
Hessian representation is apparent, with differences in the one-sigma PDF uncertainty bands of 
the order 5% at most (recall that central values coincide by construction). In Fig. the same 
comparison is performed at = 10^ GeV^, now shown as a ratio to central values. 

In Fig. [^we then compare some parton luminosities, computed for proton-proton collisions 
with a center of mass energy of 13 TeV, plotted vs. the invariant mass of the final state. Results 
are shown normalized to the central value of the NNPDF3.0 NLO Monte Carlo set. Again, 
we find excellent agreement, except in the regions of very small or very large invariant masses 
(which respectively depend on small and large-x PDFs). This is unsurprising given that these 
are extrapolation regions in which the Gaussian approximation is less good. 
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Figure 4: The figure of merit of Fig. plotted versus A^eig for fixed e = 0.25 (blue curve), for the 
NNPDF3.0 NLO set. The optimal value iVeig = 120 is denoted by a vertical dash. Results obtained by 
not optimizing the basis through the GA are also shown (red curve). 


As mentioned in Sect. 2.2, even though the figure of merit Eq. (13) used for the GA only 


optimizes the diagonal elements of the covariance matrix, correlations are automatically repro¬ 
duced thanks to the use of the original replicas as a basis. We can check this explicitly. The 
correlation coefficient between two different PDFs fa and //j, at a given value of x, in the 
Monte Carlo representation is given by |19|: 


0/3 / ,->2', _ -W-ep 


PMci^,Q^) = 




Nrep - 1 


rep 


V 


O', 


PDF 


(X,Q2) . cj|“F(x,g2) 


(14) 

where averages are taken over the Wep = 1000 replicas of the sample, and aaix,Q‘^) and 
cj/ 3 (x, g^) are the standard deviations Eq. (10). In the Hessian representation the same quantity 
is given by 


Ph essian 


(x,g2) = 


E 3Veig 

k=l 


(IL^\x,Q^) - f^\x,Q^ 


- fPix,Q^ 




(15) 

where now the sum is performed over the Wig eigenvectors, and fa\f^^'^ are the respective 
central sets, which coincide with the MC average values. 

The correlation coefficients Eqs. (14 15) before and after Hessian conversion are compared 
in Eig. again, very good agreement is seen, with differences compatible with the uncertainty 
on the Monte Carlo representation. We have checked explicitly that a similar level of agreement 
is found at the level of correlations between a number of LHC cross-sections and differential 
distributions. 

We conclnde that at the level of second moments the Hessian representation of the starting 
Monte Carlo probability distribution is very accurate. Of course, to the extent that higher 
moments deviate from Gaussian behaviour they will be accordingly not so well reproduced. 
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Figure 5; Same as Fig. but for the MMHT14 MC NLO PDF set. In this case the gaussianity 
condition Eq. (121 is not applied. The optimal value A^eig = 14 is denoted by a vertical dash. 


3.2 A closure test: Hessian representation of Hessian PDFs 


We now consider the MMHT14 NLO PDF set. In this case, the Monte Carlo representation 
which is converted into Hessian is in turn obtained by starting from an initial Hessian represen¬ 
tation, using the methodology of Ref. . We can then compare the starting and final Hessian 
sets, thereby obtaining a closure test. This provides a powerful test of the basis-independence 
of our procedure, in that the starting Hessian is defined in the space of parameters of a specific 
functional form, which is then turned into Hessian by running a Monte Carlo in parameter space, 
while our final Hessian representation uses the ensuing Monte Carlo replicas as basis functions. 


Again, we adopt the optimal choice of parameters discussed in Section 2.2 namely Wig = 14 


Note that this was obtained by relazing somewhat the criterion for keeping eigenvectors of the 
covariance matrix, due to the greater correlation of MMHT PDF replicas: indeed, we have 
verified that use of the same criterion as for the sets we considered previously would need to 
a smaller optimal number of eigenvectors (Wig = 12 instead of Wig = 14), and a considerable 
loss of accuracy. 

The starting Hessian representation and our final Hessian conversion are compared in Fig. 10 
at = 2 GeV^. Again we hnd agreements of the uncertainty to better than 5%. It is interest¬ 
ing to observe that the original Hessian representation had Wig = 25 asymmetric eigenvectors 
(corresponding to 50 error sets), while our final Hessian conversion only needs Wig = 14 sym¬ 
metric eigenvectors. This means that our algorithm has managed to achieve a compression of 
the information in the native Hessian representation, thanks to the use of replicas as a basis, 
with minimal information loss. 


3.3 LHC phenomenology 

We finally validate our Hessian conversion at the level of physical observables: standard candle 
total cross-sections and differential distributions. For simplicity, we perform all comparisons at 
NLO, given that, clearly, the accuracy of the Hessian approximation is essentially independent 
of the perturbative order. 


In Fig. 11 we compare results obtained using the Monte Carlo and Hessian NNPDF3.0 
representations, and the original and hnal Hessian representation of MMHT14 PDFs, for the 
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Figure 6: Comparison between the starting Monte Carlo representation with TV^ep = 1000 replicas and 
the final Hessian representation with iVeig = 120 eigenvectors for the for the NNPDF3.0 NLO set with 
as = 0.118. From left to right and from top to bottom we the gluon, total quark singlet, total strangeness 
and the total valence are plotted vs. x for fixed = 2 GeV^. 


total cross-sections for Higgs production in gluon fusion obtained using the ggHiggs code pO] 


top quark pair production obtained with top++ [21], and inclusive W and Z production obtained 
with VRAP 


|22j . Results are always shown normalized to the value of the original Monte Carlo 
set. For NNPDF3.0, the agreement is very good, with PDF uncertainties consistent with 10% 
differences at most. Somewhat larger differences are found for MMHT14. 

We then compare several differential distributions, chosen among those which have been 
used in the NNPDF3.0 PDF determination, and for which fast interfaces are available, either 
APPLgrid FastNLO [24] or aMCfast 25 . Again, results are always shown normalized to the 


central value of either NNPDF3.0 NLO Monte Carlo set or the MMHT14 NLO native Hessian. 
In particular we consider: 


The ATLAS high-mass Drell-Yan measurement 26 , integrated over rapidity 


< 2 . 1 , 


and binned as a function of the di-lepton invariant mass pair Mu, 


the CMS double differential Drell-Yan measurement 27 in the low-mass region, 20 > 


Mu > 30 GeV, as a function of the di-lepton rapidity yn, 

The CMS W~^ lepton rapidity distribution p^ . 

The CMS measurement of W~^ production in association with charm quarks measure¬ 


ment 29 , as a function of the lepton rapidity yi, 
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NNPDF3.0 NLO, a3=0.118 @ = lO'* GeV^ NNPDF3.0 NLO, a3=0.118 @ = lO'* GeV^ 



Figure 7: 


Same as Fig. but at = 10^ GeV^, 


and with results normalized to the central PDF. 


• The ATLAS inclusive jet production measurement [30] in the central rapidity region, 
|?/jet| < 0.3, as a function of the iet pT, and 

• The same ATLAS inclusive jet production measurement now in the forward rapidity 
region, 4.0 < |yjet| < 4.4, as a function of the jet px- 


These observables probe a wide range of PDF combinations, from light quarks and anti-quarks 
(Drell-Yan) and strangeness {W + c) to the gluon (jets) in a wide range of Bjorken-x and 
momentum transfers Q^. 

Results are shown in Fig. for NNPDF, and in Fig. 


13 for MMHT14. Again, there is a 


good agreement between the original Monte Carlo and the new Hessian representations, with 
differences smaller than 10%. 


4 A Hessian representation of combined MC sets: MC-H PDFs 


As discussed in the introduction, the Monte Carlo representation of PDFs offers the possibility of 
constructing combined PDF sets which incorporate information from different PDF determina¬ 
tions, and thus provide an alternative m 10||13] to the current PDF4LHC recommendation 0 
for the combination of predictions obtained using different PDF sets, which is less than ideal 
from a statistical point of view. 

An obvious shortcoming of a combined Monte Carlo set is that it contains generally a large 
number of replicas, which can be cumbersome to handle, and which is computationally very 
intensive. This difficulty has been handled in Ref. |14] by developing a compression algorithm. 
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Figure 8: Same as Fig.[^ but now comparing parton luminosities for proton-proton collisions at 13 TeV, 
plotted vs. the invariant mass of the final state. Results are shown normalized to the central PDF. 


whereby the number of replicas in a Monte Carlo set is optimized by means of a GA without 
significant loss of information. This has enabled the construction of sets of less than A^rep = 50 
replicas which reproduce most of the information contained in a starting N^ep = 300 replica set. 

Combined Monte Carlo set are generally non-Gaussian even when obtained by combining 
individually Gaussian PDF sets. However, once again the Gaussian approximation may often 
be adequate in practice, and then a Hessian representation may be useful for applications as 
repeatedly mentioned. The so-called Meta-PDF method has been proposed as a way of 
dealing with this problem: it consists of re-fitting a fixed functional form to the final combined 
Monte Carlo set, and thus it has the usual shortcomings related to a fixed choice of functional 
form. We now show how by applying our Monte Carlo to Hessian conversion to a combined Monte 
Carlo set we directly obtained a Hessian representation with a small number of eigenvector, 
therefore obtaining a compressed Hessian representation, which we call MC-H PDFs. 

We start with a Monte Carlo combination of the NNPDF3.0, CT14 and MMHT14 NNLO 
PDF sets, with as{Mz) = 0.118. This is the starting point of the construction of the compressed 
sets of Ref. 14 , where further details are given, and it contains Wep = 300 replicasQ We could 
in principle then first, run the compression algorithm of Ref. (l4|, and then perform a Monte 


Carlo to Hessian conversion of the ensuing compressed set of replicas. However, each of these 
two steps entails potential information loss, and thus it is more advantageous to perform directly 
a Hessian conversion of the starting set of Wep = 300 MC replicas. 


^Note that the CT14 PDF set included in this combination is still preliminary. We thank Pavel Nadolsky for 
providing this preliminary version of CT14. 
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Figure 9: Correlation coefficients between pairs of PDFs at a common value of x and versus x for 
= 10^ GeV^ before and after Hessian conversion. 


We thus apply our Monte Carlo to Hessian conversion to the combined prior with Wep = 300, 
following the methodology presented in Sect.j^ It actually turns out that significant deviations 
from Gaussian behavior are observed for PDFs for which direct experimental information is 
scarce, and thus theoretical bias or constraints play some role, such as the strange PDF. Once a 
final combined set is made available for phenomenology, a choice will have to be made in order 
to decide whether a Hessian approximation is viable. For the time being, given the preliminary 
nature of the existing set, we simply choose e = 0.25 as in Sect. 2.2 as a threshold for discarding 
non-Gaussian points. We then end up with an optimal conversion with iVeig = 90 eigenvectors, 
somewhat larger though of the same order than the number of compressed replicas of the CMC 
set (Wep ~ 40). 

We have performed an extensive set of validation tests of these MC-H PDFs, at the level of 
PDFs, luminosities and physical observables, of which we now show some examples. In Fig. 14 we 


compare PDF luminosities at the LHC with ^/s =13 TeV computed with the starting combined 
MC set and the hnal MC-H Hessian set. We find good agreement, with differences below 10% at 
the level of PDF uncertainties. It is important to note that some disagreement is to be expected 
because the starting combined set is not Gaussian, in particular for regions of x (such as large 
and small x) and PDFs that are poorly constrained by experimental data; indeed, the largest 
discrepancies are observed at low and high invariant masses Mx- These differences thus signal 
an intrinsic limitation of the Hessian representation, rather than a failure of our methodology. 

The good agreement at the level of PDF luminosities translates into a good agreement at the 
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Figure 10: Same as Fig. but for MMHT14 NLO PDFs, with iVeig = 25 asymmetric eigenvectors in 
the starting Hessian set and = 14 symmetric eigenvectors in our final Hessian conversion. 


level of physical observables. In Fig. ^ we compare the processes that we discussed in Sect. 3.3 


(with the same settings), computed using the starting combined MC PDF set and the final 
MC-H Hessian representation. Again, results are normalized to the central value of the starting 
combined set, and uncertainties bands correspond to the one-sigma PDF uncertainty in each 
bin (recall that central values of the starting and hnal PDF sets are the same by construction). 
Again, discrepancies are below the 10% level. 

We concluded that the Hessian conversion algorithm presented in this paper also provides a 
successful methodology for the construction of a Hessian representation with a moderate number 
of eigenvectors of combined Monte Carlo PDF sets. Differences between the starting MC set 
and the final MC-H Hessian representation can be used, to a certain extent, to quantify the 
degree of non-gaussianity which is present in the original set. 


5 Delivery and outlook 

We have provided a general purpose methodology for the Hessian conversion of any Monte Carlo 
PDF set. When applied to a native Monte Carlo set, this methodology provides an efficient 
Hessian representation which is faithful to the extent that the starting set is Gaussian. When 
applied to a Monte Carlo set obtained from a starting Hessian, the methodology gives back the 
original set to very good accuracy, but using the Monte Carlo replicas as a basis. Finally, when 
applied to a combined Monte Carlo replica set it provides a Hessian version (MC-H PDFs) of 
the recently proposed PDF compression methodology (MC-PDFs [^), and an implementation 
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Figure 11: Comparison of NLO inclusive cross-sections at the LHC, computed using PDFs before and 
after Hessian conversion: for NNPDF3.0 (left) the Hessian representation is compared to the original 
Monte Carlo, while for MMHT14 (right) the final Hessian conversion is compared to the original Hessian. 


of the Meta-PDF idea [I^ which is free of the bias related to the choice of a specific functional 
form. 

The main deliverable of this work is the mc2hessian code, which easily allows for the con¬ 
struction of a Hessian representation of any given Monte Carlo PDF set. The mc2hessian 
code is written in Python using the numerical implementations provided by the NumPy/Scipy 
packages . This code is publicly available from the GitHub repository 

https://github.com/scarrazza/mc2hessian 

and outputs results directly in the LHAPDF6 format [32] , so that the new Hessian sets can 
be easily interfaced by any other code. However, it should be kept in mind that the Hessian 
representation always requires careful validation, as some information loss is necessarily involved 
in this transformation, and specifically any deviation from Gaussianity is inevitably washed out. 

The Hessian version of the NNPDF3.0 sets 

NNPDF30_nlo_as_0118_hessian 

NNPDF30_nnlo_as_0118_hessicin 


as well as the MC-H PDFs 


MCH_nlo_as_0118Jiessian 
MCH_nnlo_as_0118 Jiessian 


will be made available in LHAPDF6. Future NNPDF releases will be provided both in the native 
Monte Carlo and in the new Hessian representations. 

An interesting development of the methodology suggested here is that an unbiased Hessian 
representation could be used as a way to single out the PDF flavours (and x-ranges) that provide 
the dominant contribution to individual physics process, by picking the dominant eigenvectors; 
along the lines of previous suggestions [9,33, but in a parametrization-independent way. This 


could then be used to construct tailored sets with a small number of eigenvectors which, though 
not suitable for general-purposes studies, could be useful for experimental profiling when re¬ 
stricted to a small subset of relevant processes. These and related issues will be the subject of 
future work. 
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Figure 12: Comparison of the original Monte Carlo representation and the new Hessian representation of 
NNPDF3.0 NLO for a number of differential distributions at the LHC 7 TeV. The error band corresponds 
to the one-sigma PDF uncertainty in each bin. Results are shown normalized to the central value of the 
NNPDF3.0 NLO Monte Carlo set. See text for more details. 
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Figure 13: Same as Fig. 12 but for MMHT14 NLO PDFs. The final Hessian set obtained after 
conversion of the Monte Carlo replicas is compared to the starting Hessian representation. 


A Hessian representation through Singular Value Decomposi¬ 
tion 

The solution to the problem of finding a suitable Hessian representation of MC PDF sets is not 
unique. The main strength of the approach we have explored here is that a representation in 
terms of the original MC replicas automatically inherits a number of useful properties of the 
replicas, such as, for instance, the fact that sum rules are automatically satisfied and correlations 
are well reproduced, as discussed in Sect. |3.1[ This suggests an alternative approach, in which 
instead of representing all starting replicas on a subset of them, we pick the dominant combi¬ 
nation of all replicas through Singular Value Decomposition (SVD). This alternative method is 
briefly discussed in this Appendix. So far we have used it for validation of our main methodol¬ 
ogy, though it might be especially suitable for future applications, such as the construction of 
reduced eigenvector sets for specific physical processes. 

Since the replicas of a MC set are continuous functions with finite correlation length, the 
general problem of finding a Hessian representation of a MC set can be interpreted as that of 
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Figure 14: Same as Fig. but now comparing a combined MC PDF set with with iVi-ep = 300 replicas 
to its MC-H Hessian representation with iVeig = 90 eigenvector. The starting set is obtained from the 
combination of Monte Carlo representations of the NNPDF3.0, CT14 and MMHT14 NNLO PDF sets 
containing N^ep = 100 replicas each. 


finding a representation of a discrete covariance matrix of the form Eq. (§: the sampling in the 
space of X needs only to be fine grained enough that differences between neighboring points are 
non-negligible. 

Such representation can be directly constructed in terms of Monte Carlo replicas, in the 
following way. We define the rectangular (N^Nf) x Wep matrix 

Xik = fL^\x„Qo)-fi^\x„Qo), (16) 


where we adopted the same convention for indices as in Eq. (§: a labels PDFs, i points in the 
X grid, I = Nx{a — 1) + i runs over all NxNf grid points, and k runs over all MC replicas. The 
covariance matrix Eq. © is equal to 


cov 


pdf 


1 

A^rep - 1 


XXK 


(17) 


A diagonal representation of the covariance matrix in terms of replicas is found by SVD of 
the matrix X. Namely, we can write X as: 


X = USV^ . 


(18) 


Assuming that N^^^Nx < Arepi U is an orthogonal matrix of dimensions Np^ifNx x Wep which 
contains the orthogonal eigenvectors of the covariance matrix with nonzero eigenvalues; S is 
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Figure 15: Same as Fig. 12 but now for the PDF sets of Fig. 14 


a diagonal matrix of real positive elements, constructed out of the singular values of X, i.e. 
the square roots of the nonzero eigenvalues of cov^*^^ multiplied by the normalization constant 
(A^rep “1)2; and V is an orthogonal Ai^ep x A^rep matrix of coefficients. 

Because 

XX* = US^U* = {US){US)* , (19) 

the matrix 

Z = US (20) 


has the property that 


ZZ* = XX*. 


( 21 ) 


But also, 

z = xy 


( 22 ) 


and thus Z provides the sought-for representation of the covariance matrix as a linear combina¬ 
tion of MC replicas. 
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Figure 16: Same as Fig. but now with the Hessian representation constructed using the method 
discussed in this Appendix. 


We have thus arrived at an exact Hessian representation of the covariance matrix in terms 
of replicas, but with the disadvantage that the number of Hessian eigenvectors parameters is 
now equal to = A^pdfiVi, which is generally large. However, in practice many of these 
eigenvectors will lead to a very small contribution to the covariance matrix. We can then select 
a smaller set of A^eig < eigenvectors which still provides a good approximation to the 

covariance matrix. 

A possible strategy is, for example, to select the N^ig eigenvectors with largest singular 
values. Denoting with u, s, and v the Np^fNx x Aeig, -Wig x Wig and Wig x Wep reduced 
matrices computed using these eigenvalues, for a given value of Wig) using v instead of V in 


Eq. (22) minimizes the difference between the original and reduced covariance matrix 


WUS^U^ -us\* 


(23) 


We have verihed that the method described in this Appendix provides comparable results 
to those of the main strategy discussed in the paper. To this purpose, we have selected the set 


of Wig = 120 eigenvectors that minimize the figure of merit Eq. (23). To illustrate the quality 


of the new method in Fig. 16 we show a comparison of PDF luminosities, analogous to Fig. 

but now 


and in Fig. 17 a comparison of LHC differential distributions, analogous to Fig. 
obtained with the new method. Is clear that the agreement is comparab le as that found with 
the previous method. 


The main shortcoming of this method is that minimizing Eq. (23) is not necessarily the best 


strategy for the construction of an optimal general-purpose eigenvector set, in that it does not 
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Figure 17: Same as Fig. 12 but now with the Hessian representation constructed using the method 
discussed in this Appendix. 


always lead to also minimizing the figure of merit Eq. (13) which was shown in Sect. 2.2 to be 
physically advantageous. Some tuning of the methodology would thus be required. 
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